A Hybrid Approach fior Mining Maximal Hyperclique Patterns
نویسندگان
چکیده
fi A hyperclique pattern [12] is a new type of association pattern that contains items which are highly affiliated with each other. More specifically, the presence of an item in one transaction strongly implies the presence of every other item that belongs to the same hyperclique pattern. In this paper, we present a new algorithm for mining maximal hyperclique patterns, which are desirable for pattern-based clustering methods [11]. This algorithm exploits key advantages of both the Depth First Search (DFS) strategy and the Breadth First Search (BFS) strategy. Indeed, we adapt the equivalence pruning method, one of the most efficient pruning methods of the DFS strategy, into the process of the BFS strategy. As demonstrated by our experimental results, the performance of our algorithm can be orders of magnitude faster than standard maximal frequent pattern mining algorithms, particularly at low levels of support.
منابع مشابه
Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results
Hyperclique patterns are groups of objects which are strongly related to each other. Indeed, the objects in a hyperclique pattern have a guaranteed level of global pairwise similarity to one another as measured by uncentered Pearson’s correlation coefficient. Recent literature has provided the approach to discovering hyperclique patterns over data sets with binary attributes. In this paper, we ...
متن کاملMining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution
Existing association-rule mining algorithms often rely on the support-based pruning strategy to prune its combinatorial search space. This strategy is not quite effective for data sets with skewed support distributions because they tend to generate many spurious patterns involving items from different support levels or miss potentially interesting low-support patterns. To overcome these problem...
متن کاملIdentification of Functional Modules in Protein Complexes via Hyperclique Pattern Discovery
Proteins usually do not act isolated in a cell but function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes. While many protein complexes have been identified by large-scale experimental studies, due to a large number of false-positive interactions existing in current protein complexes 10, it is still difficult to obtain...
متن کاملWIP: mining Weighted Interesting Patterns with a strong weight and/or support affinity
In this paper, we present a new algorithm, Weighted Interesting Pattern mining (WIP) in which a new measure, weight-confidence, is developed to generate weighted hyperclique patterns with similar levels of weights. A weight range is used to decide weight boundaries and an h-confidence serves to identify strong support affinity patterns. WIP not only gives a balance between the two measures of w...
متن کاملHybrid ASP-Based Approach to Pattern Mining
Detecting small sets of relevant patterns from a given dataset is a central challenge in data mining. The relevance of a pattern is based on userprovided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like Answer Set Programming (ASP) seem wellsuited for specifying such criteria in a form of constraints. Although progress has been m...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006